Employing Text Analysis for a School District’s Special Education Program Review
Data Science for Public Policy Final Project
Author
Nick Coukoulis
Introduction
A suburban school district just outside a large city on the West Coast sought a third-party, independent review of its special education program following several high-profile staff departures from the district office and negative local news coverage in which special education staff described feeling “abandoned.” The purpose of the review was to examine special education practices impacting climate, staffing, and communication within the district. The review aimed to identify strengths and areas of need so that district leadership could better understand and improve the current system. The review was conducted over a 7-month period between October 2022 and May 2023.
The review began by gathering information from extant data available at the district level, including websites, documents (e.g., school board policies and procedures, student services employee exit interview notes), and data (e.g., numbers of special education staff no longer working in the district by year). In addition to reviewing the extant data, we conducted a staff survey, ten interviews, and seven focus groups.
This analysis aims to identify strengths and areas of need for district leadership to better understand and improve their current system. In this analysis, we contextualize and present text analysis results based upon the open-ended survey responses and focus group/interview transcripts. We conclude with recommendations grounded in the results and in knowledge of best practices supported by research that are being implemented successfully in similar districts nationwide.
Code
library(tidyverse)
library(tidytext)
library(lubridate)
library(SnowballC)
library(igraph)
library(ggraph)
library(stopwords)
library(dplyr)

theme_set(theme_minimal())

# load one-row-per-line data ----------------------------------------------
surveydata <- read_csv("/Users/nicholascoukoulis/Documents/Georgetown/MPP/Spring2023/DataScience/assignments/project/Groupproject/individualresponses_survey_updated_5.1.23.csv")
surveydata <- filter(surveydata, !is.na(text))

# dropping consent & working years since we've cleaned data to just include these
surveydata <- surveydata[-c(1:2)]
Methods
Data Collection
Seven one-hour virtual focus groups with district-level administrators, building administrators, certified support staff (e.g., related service providers, school psychologists), special education teachers, general education teachers, and non-certified support staff (e.g., transportation staff, paraeducators) were conducted. We spoke with 60 staff total, with an average focus group size of 9 members.
Ten interviews were conducted that included current and former district staff serving in the district office. Roles interviewed included superintendent, assistant superintendent, itinerant services program manager, director of student support services, director of special education, manager of special education programs, director of psychology and counseling, deaf and hard of hearing program director, and president and uniserv representative of the teachers’ union.
All staff in the district (school-level and district-level) were invited to complete a survey administered via SurveyMonkey that included 33 Likert-scale items aligned with the research questions guiding the review, seven open-ended response questions, and six multiple choice questions. Estimated response time for the survey was 15-20 minutes.
The district has a total of 3,023 employees. A total of 1,205 staff (40% of all staff) consented to take the survey, indicated they had worked in the district during the 2021-22 school year, and shared their role. Of this total:
417 (35%) were general education teachers;
308 (26%) were non-certified support staff (including paraeducators);
151 (13%) were related service providers, psychologists, or other certified support staff;
119 (10%) were special education teachers;
28 (2%) were school administrators;
18 (1%) were district administrators;
12 (1%) were transportation staff; and
152 (13%) identified as “other.”
Individuals who selected “other” were asked to specify their role; responses included, but were not limited to, custodian, music specialist, food service worker, substitute teacher, and ASL interpreter. Several staff in certificated roles, such as ELL teacher, behavior and emotional support specialist, and certificated nurse, selected “other” as well. The survey was open for two weeks in January 2023, from January 9 to January 24.
Data Analysis
Term Frequency - Inverse Document Frequency (TF-IDF)
A Term Frequency - Inverse Document Frequency (TF-IDF) analysis is used for almost all survey questions relating to climate and communication. “Term Frequency” (TF) measures how often a word appears in a document. Because documents vary in length, and longer documents should not count for more than shorter ones, the raw count is normalized by dividing it by the total number of words in the document.
If a term does not appear in a given document, its TF score is 0. Likewise, if a document consists of a single repeated term, that term’s TF score is 1. All TF scores therefore fall between 0 and 1.
“Inverse Document Frequency” (IDF) measures the informativeness of any term, t. An IDF value will be very low for “stop words,” or common words like “is” or “a,” because they appear in nearly every document and are not very informative on their own. In addition to these general stop words, we defined domain-specific stop words, such as words that appear in the survey questions themselves, because respondents tend to repeat them in their answers. Removing them improves the accuracy of the analysis.
Thus, the TF-IDF score measures how distinctive term t is to a document, calculated by weighing the number of times term t appears in that document against the number of documents in which term t appears. A higher TF-IDF score indicates greater importance and relevance of a term; less important terms have scores closer to 0. TF-IDF was selected to provide a quick summary of priorities and keywords across survey items and to make it easy to compare and contrast priorities and concerns between roles. It was also used to categorize words as positive or negative when setting the topics for topic modeling.
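Written out, the quantities described above take the following form. The natural logarithm in the IDF term is an assumption: the text does not state a log base, although the natural log is the default used by `bind_tf_idf()` in the tidytext package loaded earlier. Here n_{t,d} is the number of times term t appears in document d, and N is the total number of documents.

```latex
\mathrm{tf}(t, d) = \frac{n_{t,d}}{\sum_{t'} n_{t',d}}, \qquad
\mathrm{idf}(t) = \ln\frac{N}{\lvert \{ d : t \in d \} \rvert}, \qquad
\text{tf-idf}(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t)
```

Under this convention, a term that appears in every document has idf(t) = ln(1) = 0, so its TF-IDF score is 0 no matter how often it occurs.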
Bigrams
A bigram chart was used for one survey item related to ineffective communication. A “bigram” is a pair of adjacent terms in a response. The bigram chart employs a frequency count of these term pairs and is an effective way to visualize relationships between terms.
A bigram chart was selected as the best analysis method for this survey item because it most clearly shows how staff relate to current district communication methods and most clearly identifies that concerns center on communications sent at the district level.
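A minimal sketch of how such a bigram chart could be built with the packages loaded earlier is shown here. The toy responses are invented for illustration and are not survey data, the `text` column name is taken from the loading code, and the choice of layout and aesthetics is an assumption rather than the chart actually used in the review.

```r
library(dplyr)
library(tidyr)
library(tidytext)
library(ggraph)

# Toy responses for illustration only (not survey data).
responses <- tibble(
  text = c(
    "District emails arrive late and robo calls get ignored",
    "Too many district emails and not enough staff meetings",
    "Robo calls and district emails are hard to follow"
  )
)

bigram_counts <- responses %>%
  # split each response into overlapping pairs of adjacent words
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
  separate(bigram, into = c("word1", "word2"), sep = " ") %>%
  # keep only pairs in which neither word is a general stop word
  filter(!word1 %in% stop_words$word, !word2 %in% stop_words$word) %>%
  count(word1, word2, sort = TRUE)

# Each bigram becomes a directed edge (word1 -> word2) carrying its count n.
bigram_graph <- igraph::graph_from_data_frame(bigram_counts)

# Build the network chart; print `bigram_plot` to draw it.
bigram_plot <- ggraph(bigram_graph, layout = "fr") +
  geom_edge_link(aes(edge_alpha = n)) +
  geom_node_point() +
  geom_node_text(aes(label = name), vjust = 1, hjust = 1)
```

Because the edges are weighted by raw counts rather than TF-IDF, frequently paired terms stand out directly, which fits the goal of showing which communication methods staff mention together.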
Topic modeling is a method for unsupervised classification of text. Its strength is uncovering hidden groups of topics even when we do not know in advance what to look for in the text data. In this paper, we first analyzed the responses question by question. Next, we employed topic modeling to find the most important predictors of negative and positive climate in the district across the survey data. For instance, frequently used words such as “email” and “meeting” are important but can carry both negative and positive meanings. Topic modeling can surface these hidden meanings by annotating each document with its predicted topic, which helps identify the most impactful predictors.
Latent Dirichlet Allocation (LDA) is one method of topic modeling. LDA aims to infer the topics of a document from the words it contains. The method treats each document as a bag of words, disregarding grammar and word order.
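As a rough illustration of fitting an LDA model in R, the sketch below uses the `topicmodels` package, which is an assumption because it is not among the libraries loaded in the original code. The four toy responses, the `id` column, the seed, and the choice of k = 2 (echoing the two climate topics) are likewise illustrative assumptions.

```r
library(dplyr)
library(tidytext)
library(topicmodels)  # assumption: not loaded in the original code

# Toy responses for illustration only (not survey data).
responses <- tibble(
  id = 1:4,
  text = c(
    "lack of staff and lack of time",
    "not enough staff or planning time",
    "weekly meeting with building admin works well",
    "regular meeting and clear email from building admin"
  )
)

# Bag-of-words representation: one row per response, one column per word.
response_dtm <- responses %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  count(id, word) %>%
  cast_dtm(id, word, n)

# Fit a two-topic model; the seed is arbitrary and only fixes the result.
climate_lda <- LDA(response_dtm, k = 2, control = list(seed = 1234))

# beta: per-topic word probabilities; gamma: per-document topic proportions.
topic_terms <- tidy(climate_lda, matrix = "beta")
doc_topics <- tidy(climate_lda, matrix = "gamma")
```

Sorting `topic_terms` by `beta` within each topic lists the words most characteristic of that topic, while `doc_topics` shows how strongly each response is associated with each topic.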
Climate and Culture Facilitators
Considering your school, what factors do you believe are presently or would be facilitators of a positive climate and culture related to special education?
A single-term TF-IDF approach was applied here to determine the most frequent, unique terms related to climate and culture facilitators.
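A single-term TF-IDF of this kind could be computed roughly as follows. This is a sketch under stated assumptions: the toy responses are invented, the `role` column name and the choice to treat each role as one “document” are assumptions based on the stated goal of comparing roles, and the domain-specific stop words shown are hypothetical examples.

```r
library(dplyr)
library(tidytext)

# Toy responses for illustration only; `role` is an assumed column name.
responses <- tibble(
  role = c("special education teacher", "general education teacher", "paraeducator"),
  text = c(
    "More planning time and support for kids",
    "Training and planning time with special education staff",
    "Better pay and more support for paraeducators"
  )
)

# Hypothetical domain-specific stop words (terms echoed from the question).
domain_stop_words <- tibble(word = c("special", "education"))

role_tf_idf <- responses %>%
  unnest_tokens(word, text) %>%                     # one lowercase word per row
  anti_join(stop_words, by = "word") %>%            # drop general stop words
  anti_join(domain_stop_words, by = "word") %>%     # drop domain stop words
  count(role, word, name = "n") %>%
  bind_tf_idf(word, role, n) %>%                    # each role is one document
  arrange(desc(tf_idf))
```

Terms used by only one role receive the highest scores, while a term used by every role gets an IDF of 0 and drops out of the ranking.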
Climate Barriers
Considering your school, what factors, if any, do you believe are barriers that inhibit a positive climate and culture related to special education?
A bigram TF-IDF approach was applied to determine the most frequent, unique pairs of terms related to climate and culture barriers.
Effective Communication
Considering both district- and school-level communications, which practices do you perceive as effective?
A bigram TF-IDF approach was applied to determine the most frequent, unique pairs of terms related to effective communication.
Ineffective Communication
Considering both district- and school-level communications, which practices do you perceive as ineffective or inefficient?
A straightforward frequency count of bigrams was applied to clearly display relationships between terms and responsible parties/methods for ineffective communication.
Communication Recommendations
What recommendations do you have for improving communication practices in Edmonds Public Schools?
A bigram TF-IDF approach was applied to determine the most frequent, unique pairs of terms related to communication recommendations.
First, two topics representing negative and positive climate were set. TF-IDF was run on all respondent survey data, and negative and positive words were each associated with two questions: negative responses with the “climate barriers” and “ineffective communication” questions, and positive responses with the “effective communication” and “communication recommendations” questions.
Calculating Probabilities
Word probabilities across survey questions regarding climate barriers, effective communication, ineffective communication, and communication recommendations.
Finding the best model
In the coefficient analysis, the word “lack” has the largest impact in predicting a negative climate in the school workplace. Among predictors of positive climate, “meeting” carries both negative and positive meanings, yet is more likely to be used in a positive context based on this analysis.
In terms of variable importance, “lack” is the strongest predictor of a negative climate in the survey data. Term importance decreases sharply after “lack.” “Enough” and “student” round out the top three and are also predictors of negative climate.
As a whole, predictors of negative climate received higher scores than predictors of positive climate. These scores suggest that negative climate factors could be easier to identify and eliminate than goals around creating a positive climate.
The random forest model outperforms the lasso model in terms of root mean squared error (RMSE).
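For reference, RMSE is the square root of the average squared difference between observed and predicted values, so a lower value indicates a better fit. The sketch below shows the comparison on made-up numbers: the outcome coding (1 = negative climate) and both prediction vectors are hypothetical and are not the study’s results.

```r
# Root mean squared error between observed and predicted values.
rmse <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  sqrt(mean((actual - predicted)^2))
}

# Hypothetical outcomes and predictions for illustration only.
actual      <- c(1, 0, 1, 1, 0)
lasso_pred  <- c(0.6, 0.4, 0.7, 0.5, 0.3)
forest_pred <- c(0.8, 0.2, 0.9, 0.7, 0.1)

rmse_lasso  <- rmse(actual, lasso_pred)
rmse_forest <- rmse(actual, forest_pred)

# The model with the lower RMSE is preferred on this metric.
better_model <- if (rmse_forest < rmse_lasso) "random forest" else "lasso"
```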
A review of the data revealed the following strengths:
Strength 1: Staff remain focused on best serving students and families.
We see “kids” appear under Climate and Culture Facilitators, and the parent communication app “parent square” appears under Effective Communication.
Strength 2: Generally effective communication among staff and administrators at the building level.
Under Effective Communication, “building admin,” “school level,” or other similar terms are quite frequent.
Strength 3: At all levels, staff acknowledge the district possesses a strong, qualified group of educators.
Strength 4: Staff recognize that the district was, and remains, a well-regarded school district to which families relocate so their children can receive a high-quality education and special education services.
The 2020-21 and 2021-22 school years were particularly challenging because of the COVID-19 pandemic, but they were also challenging for the district in other ways, including changes in leadership (those years appear as consistent reference points throughout the text analysis). Given the recency of these changes, and because new practices have not yet become entrenched, there is still time for leadership to course-correct and steer the district back to being a place students, families, and staff wish to be part of.
Recommendations
A review of the data, including the text analysis review, strongly supports implementation of the following recommendations:
Recommendation 1: Establish a cross-district advisory committee for special education to address priority concerns.
Establishing a cross-district advisory committee for special education is a proactive and inclusive approach to address priority concerns and improve the educational experiences of students with special needs. This committee would bring together key stakeholders from across the district, including educators, administrators, parents, community members, and experts in the field of special education. By fostering collaboration and shared decision-making, the committee should aim to identify, discuss, and develop strategies to address the specific challenges faced by students with disabilities.
The committee’s primary purpose is to create a platform for open dialogue, information sharing, and problem-solving regarding special education issues. By convening regular meetings, members can discuss and analyze the priority concerns affecting students with special needs, such as curriculum adaptations, teacher training, inclusive practices, resource allocation, and parental involvement.
Recommendation 2: Create a dedicated special education director position whose sole responsibility is to oversee special education.
Creating a dedicated special education director position whose sole responsibility is to oversee special education is an important step in ensuring that students with special needs receive the support and resources they require. Leadership in this capacity is currently highly desired and this position would play a crucial role in managing and improving special education programs within the district, addressing the unique needs of students with disabilities, and advocating for their rights.
Many questions from staff (see recommendation 3) go unanswered by district administrators because they do not have the answers to questions and cannot get in touch with the assistant superintendent currently tasked with answering them. This director position would act as an intermediary to answer staff questions immediately.
Recommendation 3: Develop a plan to implement relationship-building strategies in addition to communication.
Developing a plan to implement relationship-building strategies in addition to communication is essential for fostering positive connections and strengthening bonds in personal and professional interactions across the district. While effective communication is crucial, it is equally important to engage in activities and strategies that go beyond mere information exchange.
Recommendation 4: Set expectations for administrative visits to classrooms and for responding to emails and phone calls. Ensure there are enough administrators to meet those expectations.
Expectations need to be set for district administrators to visit classrooms, respond to emails and phone calls (in a timely way), and meet the expectations of staff and administrators in schools. Across roles, many staff cite a lack of responsiveness to their phone calls and emails as barriers.
Recommendation 5: Immediately reinstitute job alike meetings.
“Immediately reinstituting job alike meetings” refers to the practice of organizing meetings or gatherings that bring together individuals who hold similar positions or work in similar roles within the district. These meetings are aimed at facilitating collaboration, knowledge-sharing, and professional development among employees who share common job responsibilities or face similar challenges.
Evident from the text analysis (see the Effective Communication and Communication Recommendations sections) is staff members’ attachment to these meetings and the value they see in them. Emails and robo calls have limited effectiveness and aren’t well-liked, as seen in the Ineffective Communication section.
Recommendation 6: Review FTE allocated to the teaching program vs. itinerant services to determine appropriate staffing and areas that can flex.
In the Climate and Culture Facilitators section, we see that many staff feel overwhelmed by their job responsibilities and use staffing-related terms such as “substitute” teachers and various staff roles. “Time” is a recurring facilitator identified across roles, as is “pay.”
Recommendation 7: A culture shift is required.
There is currently a pervasive “us versus them” mentality at all staff levels in the district. The district’s administration is structured in such a way that teaching and support staff are pitted against one another in competition for resources; school staff blame district staff for challenges faced in their roles; and school staff feel siloed from one another and communicate only with others in their departments (a recurring request is more planning time with other roles, particularly from general education teachers curious about the special education process).
We also see from the text analysis that accessibility and equity, as frequently recurring terms used by certified support staff and special education teachers, need to be a higher priority in the district given its vision and commitment focused on “equity, engagement, and excellence for every student.” “Accessibility” can refer to curriculum in the classroom or simply access to physical spaces, as we see “double doors” and accessible “bathrooms” listed as priorities by staff.
References
Analyzing tf-idf results in scikit-learn—Datawerk. (n.d.). Retrieved May 10, 2023, from https://buhrmann.github.io/tfidf-analysis.html
Scott, J. (2021, March 29). What’s in a word? Medium. https://towardsdatascience.com/whats-in-a-word-da7373a8ccb
Scott, W. (2021, September 26). TF-IDF for Document Ranking from scratch in python on real world dataset. Medium. https://towardsdatascience.com/tf-idf-for-document-ranking-from-scratch-in-python-on-real-world-dataset-796d339a4089
Understanding TF-IDF for Machine Learning. (n.d.). Capital One. Retrieved May 10, 2023, from https://www.capitalone.com/tech/machine-learning/understanding-tf-idf/